1 


GCGGCAGCGG 


CGGCGGCTGA 


GGAGGGCCCG 


GO CTGCGAGA 


GCCTCAGTGG 


51 


GAGCCGGCTC 


AGCCCTCGGC 


CACCATGTCG 


GCGCCGTCGG 


AGGAGGAGGA 


101 


GTACGCGCGG 


CTGGTGATGG 


AGGCGCAGCC 


GGAGTGGCTG 


CGCGCCGAGG 


151 


TGAAGCGGCT 


GTCCCACGAG 


CTGGCCGAGA 


CCACGCGTGA 


GAAGATCCAG 


201 


GCGGCCGAGT 


ACGGGCTGGC 


GGTGCTCGAG 


GAGAAGCACC 


AGCTCAAGCT 


251 


GCAGTTCGAG 


GAGCTCGAGG 


TGGACTATGA 


GGCTATCCGC 


AGCGAGATGG 


301 


AGCAGCTCAA 


GGAGGCCTTT 


GGACAAGCAC 


ACACAAACCA CAAGAAGGTG 


351 


GCTGCTGACG 


GAGAGAGCCG 


GGAGGAGAGC 


CTGATCCAGG 


AGTCGGCCTC 


401 


CAAGGAGCAG 


TACTACGTGC 


GGAAGGTGCT 


AGAGCTGCAG 


ACGGAGCTGA 


451 


AGCAGTTGCG 


CAATGTCCTC 


ACCAACACGC 


AGTCGGAGAA 


TGAGCGCCTG 


501 


GCCTCTGTGG 


CCCAGGAGCT 


GAAGGAGATC 


AACCAGAATG 


TGGAGATCCA 


551 


GCGTGGCCGC 


CTGCGGGATG 


ACATCAAGGA 


GTACAAATTC 


CGGGAAGCTC 


601 


GTCTGCTGCA 


GGACTACTCG 


GAACTGGAGG 


AGGAGAACAT 


CAGCCTGCAG 


651 


AAGCAAGTGT 


CTGTGCTCAG 


ACAGAACCAG 


GTGGAGTTTG 


AGGGCCTCAA 


701 


GCATGAGATC 


AAGCGTCTGG 


AGGAGGAGAC 


CGAGTACCTC 


AACAGCCAGC 


751 


TGGAGGATGC 


CATCCGCCTC 


AAGGAGATCT 


CAGAGCGGCA 


GCTGGAGGAG 


801 


GCGCTGGAGA 


CCCTGAAGAC 


GGAGCGCGAA 


CAGAAGAACA 


GCCTGCGCAA 


851 


GGAGCTGTCA 


CACTACATGA 


GCATCAATGA 


CTCCTTCTAC 


ACCAGCCACC 


901 


TGCATGTCTC 


GCTGGATGGC 


CTCAAGTTCA 


GTGACGATGC 


TGCCGAGCCC 


951 


AACAACGATG 


CCGAGGCCCT 


GGTCAATGGC 


TTTGAGCACG 


GCGGCCTGGC 


1001 


CAAGCTGCCA 


CTGGACAACA 


AGACCTCCAC 


GCCCAAGAAG 


GAGGGCCTCG 


1051 


CACCGCCCTC 


CCCCAGCCTC 


GTCTCCGACC 


TACTCAGTGA 


GCTCAACATC 


1101 


TCTGAGATCC 


AGAAGCTGAA 


GCAGCAGCTG 


ATGCAGATGG 


AGCGGGAAAA 


1151 


GGCGGGCCTG 


CTGGCAACGC 


TGCAGGACAC 


ACAGAAGCAG 


CTGGAGCACA 


1201 


CGCGGGGCTC 


CCTGTCAGAA 


CAGCAGGAGA 


AGGTGACCCG 


CCTCACAGAG 


1251 


AATCTGAGTG 


CCCTGCGGCG 


CCTGCAGGCC 


AGCAAGGAGC 


GGCAGACAGC 


1301 


CCTGGACAAC 


GAGAAGGACC 


GTGACAGCCA 


TGAGGATGGG 


GACTACTACG 


1351 


AGGTGGACAT 


CAACGGGCCT 


GAGATCTTGG 


CCTGCAAGTA 


CCATGTGGCT 


1401 


GTGGCTGAGG 


CTGGCGAGCT 


CCGCGAGCAG 


CTCAAGGCAC 


TGCGCAGCAC 


1451 


GCACGAGGCT 


CGTGAGGCCC 


AGCACGCCGA 


GGAGAAGGGC 


CGCTATGAGG 


1501 


CTGAGGGCCA 


GGCACTCACG 


GAGAAGGTCT 


CCCTGCTAGA 


GAAGGCCAGC 
1551 


^- G v- v_ AGG ACC 


GCGAGCTGCT 


GGCCCGGCTG 


GAGAAGGAGC 


TAAAGAAGGT 


1601 


G AG\. GACGTC 


GCCGGCGAGA 


CACAGGGCAG 


CCTGAGTGTG GCCCAGGATG 


1651 


AUC TGGTG AC 


CTTCAGTGAG 


GAGCTGGCCA 


ATCTCTACCA 


CCACGTGTGC 


1701 


A I\jTGCAACA 


ATGAGACACC 


CAACCGTGTC 


ATGCTGGACT 


ACTACCGCGA 


1751 


GGGC C AGGGC 


GGGGCCGGCC 


GCACCAGTCC 


CGGGGGCCGC 


ACCAGCCCCG 


1801 


AGGCGCGTGG 


CCGGCGCTCA 


CCCATCCTCC 


TACCCAAGGG 


GCTGCTGGCT 


1851 


CCTGAGGCGG 


GCCGAGCAGA 


TGGTGGGACG 


GGGGACAGCA 


GCCCCTCGCC 


1901 


TGGCTCCTCA 


CTGCCATCAC 


CCCTGAGTGA 


CCCACGCCGG 


GAGCCCATGA 


1951 


ACATCTACAA 


CCTGATCGCT 


ATCATCCGTG 


ACCAGATCAA GCACCTGCAG 


2001 


GCAGCCGTGG 


ACCGCACCAC 


GGAGCTGTCA 


CGCCAGCGCA 


TTGCCTCTCA 


2051 


GGAGCTGGGC 


CCCGCCGTGG 


ACAAGGACAA 


GGAAGCGCTT 


ATGGAGGAGA 


2101 


TCCTCAAGCT 


GAAGTCGCTG 


CTCAGCACCA 


AGCGGGAGCA 


GATCACCACG 


2151 


CTGCGCACTG 


TGCTCAAGGC 


CAACAAGCAG 


ACGGCCGAGG 


TGGCCCTTGC 


2201 


CAACCTGAAG 


AGCAAGTATG 


AGAATGAGAA 


GGCCATGGTT 


ACCGAGACCA 


2251 


TGATGAAGCT 


GCGCAATGAG 


CTCAAGGCCC 


TCAAGGAGGA 


CGCAGCCACC 


2301 


TTCTCCTCGC 


TGCGTGCTAT 


GTTTGCCACC 


AGGTGTGACG 


AGTACATTAC 


2351 


ACAGCTGGAT 


GAGATGCAGC 


GGCAGCTGGC 



GGCTGCTGAG 


GACGAGAAGA 


2401 


AGACGCTGAA 


CTCGCTGCTG 


CGCATGGCCA 


TCCAGCAGAA 


GCTGGCGCTG 


2451 


ACCCAGCGGC 


TGGAGCTGCT 


CGAGCTGGAC 


CATGAGCAGA 


CCCGGCGTGG 


2501 


CCGTGCCAAA 


GCCGCCCCGA 


AGACCAAGCC 


AGCCACACCG 


AGCCTGTAGA 


2551 


GTAGCTGCCA 


GGAGGACTTG 


GCCACCCGGC 


CCTGTCACAC 


TGCAGCCCCT 


2601 


TCCCCTTCCC 


TCTCGTGGCC 


CACAAGGAGG 


AAGGAAGGGC 


AACCTAAAAG 


2651 


CCCACTTAGA 


AACTTTTTGG 


ATATGCCACT 


GCAATTCTTT 


TCAAAATAGC 


2701 


ATTCCCCAGG 


TTTTTAATGG 


GAGGAAAAAA 


AGCTTTAATG 


TTGAGCATGC 


2751 


TGCGAGCTGC 


TGCGTGGAAA 


GGCCTCTGTA 


TGGGCCGAAG 


ACCCTTCTTC 


2801 


CCTGGCTGCC 


AGGCTCGCCA 


GGAGCCCACT 


GGAAACGCCC 


ACCACGGGGG 


2851 


CTCCTTGTTA 


CACATGTTCT 


TTTTTTATCC 


GATCAACCTG 


TGCACTTTTG 


2901 


ATATTTTGAT 


ATTATATTTG 


CTTCCTTAAT 


TCCTCGCGTA 


GAGACGGTCT 


2951 


CAGGTGCCGT 


GGTCTATGCT 


CGTGGTCCTG 


TAGCTGTCCG 


CCTCAGCTCC 


3001 


CACCGTGTTT 


GTCTGGTGTC 


AGCACGAGGC 


AGAGCTGTGT 


GCTCCATAGC 


3051 


GTGTAGCTTT 


AGACTCGGAG 


ATGAGTGCTT 


TGACCCAGCG 


AGGAGCTCAG 


3101 


CTAAGTGTAT 


CCACGCTGTG 


GTTCAGCAGC 


CTTTAGATCA 


TACGGCATTG 


3151 


TGGTTCATGT 


TTGAAATTAC 


AGATTTTAAA 


TGCCATGTTC 


ATTAAGAAAT 






3201 


CCAGGGTATT 


CAGATTCTGG 


GGTTTTTCAT 


ATTGTATTAT TATTATTCTT 


3251 


AGGAATAGTT 


CAATGTAACA 


AGAAGAAAAC 


TTGACCTTTG CTCTGGTTAA 


3301 


AACAGTAATA 


GGCACTTGAA 


AAAAAAAGAT 


AAATTATTGA ATGAGTAGTA 


3351 


TTACCTACAA 


ATTCCAGAAT 


TTTCTGGGTT 


TTAGGACGTT GTGAAGCATG 


3401 


ACTGATTAAC 


AGAATTTTAT 


ACAACTGTAC 


CAATAAAATT CCAAATTGGA 


3451 


ATTGTTTTGT 


TACTCTGGTT 


GTTGTGCCAA 


ATTGTGGTAC ACTTAGAAAA 


3501 


TTCTACAGTC 


GTCGATTTTT 


AGGGTGTTCT 


CTTTCAACAC CTTTTTGTTA 


3551 


GTAATCATTG 


CCAGTAGTGC 


CTTCATCAGT 


TAAGGGAGGT GTCCCAGCAC 


3601 


AGATCATTCT 


CAAAAGCGAG 


CAGGGAAGAG 


CTAGTGGGCA TGCTGAAGGC 


3651 


CAGCGTGGAC 


AGCAGGTGAG 


GCAGGTGCTC 


CTCACACCCA GACCTGGGCA 


3701 


TCTTCATTGA 


GGGAAAGAAA 


ACAGTCATTG 


TGCAAAATTC TGTTAGTCAG 


3751 


TGATTCTTTA 


CTTGCAAATT 


CAGGGGCTTA 


GAAAATGAAA GCAAACACAA 


3801 


AACCTTGAGT 


GTGCTTTGGG 


AACCAAATGG 


ACCTTCTGGG ACAAGCTGAG 


3851 


CAAGCTGTAT 


GAACGCCACG 


TTTGTGAAGA 


GCTGAGGGTA TCAGGAGGGC 


3901 


CGACGCTGTG 


TTGGCATGCG 


CAGTAGGGGA 


TGAGGGTTAG CCATAGTATT 


3951 


CTTTGCAAAT 


GTGAAAGCGA 


GACATTATAT 


CTTCTCTTGC TTGGTGTAAC 


4001 


TAATCACTGT 


TAATTTCAGG 


AAACAGAACT 


CATTAAAACT CCTTAGCAAA 


4051 


CCAGGTCTAC 


ATCCTGTTTT 


GTTTGCTGAG 


TGAGGTTAGT GGGAGTGGTC 


4101 


AAATTGGTAC 


TCTTGGAGGA 


AGAAAAACTG 


TCCTTCCTTC TCCAAAAAAG 


4151 


GAAAAATTAT 


AATAATATAA 


ATGACAAAAA 


TAAAAGAATT CTGTTTCCTG 


4201 


GAATAAGCAT 


TTCTTATTCC 


TAGTTGTAGG 


GACTCCTATT TTTACCTTCC 


4251 


GTTACAGTGT 


TGATTCATAA 


GAAATATTGT 


TACATTTGAG ATAACTTCAT 


4301 


CTGTATGGGG 


TATTTATTTG 


CAATGATGTC 


TGAGTACTGT ATTTTTTCTG 


4351 


TGCATTACCT 


TAGTGTCAGA 


ATGTTGGTCT 


TTATTTTAAA GTCATATGCA 


4401 


TGTTCTCCTG 


CCAAGGAACC 


TTTACACAGA 


CCCAAACAAA AAAATAATAA 


4451 


x wu\n i. Vj\.v« X 




AC»AAAATGAG 


GCAGAGCATG GAAAAGGAAT 


4501 


AGGAAGGAGA 


AATTAATTGA 


GATTTTCAGG 


ACACAGACAT ATGATGTGAA 


4551 


TGCCTACAAA 


GCCAGTGTGC 


ATAGGAACAG 


TGGGCCTGGG TAAAGAGTCA 


4601 


CATTGGTAGG 









1 MSAPSEEEEY ARLVMEAQFE WLRAEVKRLS HELAEITOEK IQAAEVGLAV 
51 LEEKB3LKD3 FEELEVDVEA. IRSEMSQLKE ATOQAHINHK KVAAECTSRE 

101 ESLIQESASK EQYYVRKVLE VUINIQSEME KIASVAQELK 

151* EIM3WEIQR C2RLRDDIKEY KFHEARLLQD YSELEEENIS I^KQVSVLRQ 

201 M2VEFH3LKH ETKRLEEETE YLN9QLEEAI RLKEISE^QL EEALETLKIE 

251 RESKNSLKKE LSHYMSINDS FVTSHLHVSL DGLKESETftA EE*UDAEALV 

301 M3FEH3GLAK LP3X3SKTSTP KKEELAFPSP SLVSC3XSEL NISEIQKLKQ 

351 QLM3MEREKA G3J7mjQOIQ KQLEHIPGSL SEQQEKVTRL TENLSAISKL 

401 QASKEPtfEAL ENEKERDSHE DGDYYEVDIN GPEHACKXH VAVAEAGEU* 

451 EQLKALRSTH EAKEA(JiAEE KGKYEAH3QA LTEXVSLLEK ASRQCKEUA 

501 RLEKELKKVS DUAGETP2GSL SVAQCELVIF SEELANLYHH VaOWETEN 

551 KVMIXra^EG QGGPCKCSPG GRTSPEARGR FSPILLPKGL IAPEAGRADG 

601 GTTCDSSPSFG SSLPSPLSDP RREEMJIYNL IAHRDQXKH IQAAVEKTIE 

651 LSRQRIASQE DGPAVEKEKE AIMEEXLKIK SLLSIKREQI TTLKTVLKAN 

701 KOTAEVALAN UCSKYENEKA M^/TEJIM^KLR NEXKALKEEA AIFSSLRAMF 

751 AraCCEYI'iy LHM3RQLAA AHCEKKTLNS Ui^IQCKL ALTQRLELLE 
801 IXHEJQTOBGR AKAAPKTKPA 



FIGURE 2 



1 ATOTOOGrroc TOOQOGPCTA GSAGOGACAC TOOGMTOCA TCAACIU3GA 

51 ci'riUXSRQC GPCTCX30QGJG GITO003QGA CIOGRGTCDG Q3QCCI3GaG 

101 OC2GTCW3QG aOOQOGWXX: GQCX33O3O0G OC3303GAOCA a^a^ACTO 

151 CACIACATOC OCATOCXXCT CCT3Q00a3C GQCX3C3CTTOG O3GAM0CAC 

201 QCIOEAOCXX: OQCAOOGAGG ATCACICACT GOlTClU'lUi AAGGAMTTCG 

251 Arriu£O003 GCTOICIGAG AMGAAOGTC GTOATOOCTT GAATCAGA2T * 

301 GTEATICOTG CfiCTOCTOCA GCATCPCAAC ATTAITOOCT A^TACAATCA 

351 CTlCATOGfiC AAI&3CAGQC TOCIGATTCA QCT3GAA3AT 

401 OSAftOCTOIA TCACAAAATC CTTO3ICMA AGG&CAftGTT CTTIXSG3AA 

451 GMATOGfTCG TGTOGTnVXT ATTICMATT GITIXZPOCM TCMCTOCAT 

501 OGATAAAGCT OGAATOCTIC ATAGPGA3AT AAMACATIA AATATTCTIC 

551 TCACCAAG3C AAAOCIGATA AAACTTOG&G ATTATCOXT A3CAAAGAAA 

601 CTIS^ATICIG ^TATTOCAT QQCICMAOG CITCIQ3GAA OOOCAXAOTA 

651 CATCTCIOCA GPOCICICTC APGGPOEAAA CTACAA3TIC AM7ICIGAIA 

701 TCTOOQC^Gr T3XTO0GFIC A1T1T1UAAC TOCTTACCTT AAaGAGGAOS 

751 TITCMQCm CAAAOOCACT TAftOCIGICT GIGAftGAIOG TOCAMGAAT 

801 TOQQQCCATC GAAGTTTCACT COT03CAGEA CICITIQGAA TTCATOCAAA 

851 TOGrlTCATIC GrTQOCTIGWU CAGGATOCTG TZ^TOCMAT 

901 GAftCTICERG MOXXXTCT TCTCAG3AAA 03CMGaGAG AGATOSMGA 

951 AAAMnCACT CTOCTTAATO CAGCTfiCAAA GMAOCAMS 

1001 TCM7TCAMC M0CA2TGCT GIAGIAACAT GM3GAAGCBG TCAAGICTAT 

1051 Grna33GTIG GIOGAAAATC CAO0000CAG AAACIQGATO TEATCAfiGfiG 

1101 T3QCTCTAGT 0003330*33 TCICT3CM3 GAAI&O0CAC TTCT3CICT33 

1151 TCACMflOGA GAMGA££TC TW3W7TIG3G TCAW2KPQCA AQGM3CW2T 

1201 AAWrrOCATO GTC30CTO3G OCAIQGPGPC AAMOCTOCT ATOSfiCMCC 
1251 aaagcatoig gaaaagttoc aagocaamc tmocatcag gimio 

1301 GTCATCATIT OCTClUlUr GTCACIGATO A3C3GTCROCT CrKTOOCTIC 
1351 GGATCfiGAIT ATIMX33CTC CATO3Q0CT0 GACAAAGITC CiaSOCCIt^ 
1401 AGIQCXAGAA OXATOC^C TCAACTICTT CCTCAGCAAT 
1451 AGGTICIOCTC TOGBGAIAAT CATCIGGTOG TICIGACMG AAACAAGGAA 
1501 GICTATTCTT G330CTCTOG OGAAEAITOGA OGACIO0GIT TOCMTCAGA 
1551 AGAGGATEAT TATACTOCAC AAAAGGTO3A TCITCCCAAG QOdTCATEA 
1601 TICTTOCAGT TOIGATOQGA CATITCTCIT GAOXftGICA 

1651 QQCAAAGTOC TO30CTCITJ3 ACICAATCAA TTCAMAMC TOOGICIGAA 

1701 TtaGIOCAro TC03GAATIA TCAAOCAroA AGCMMCAT GAAGTiaXT 

1751 ACACAAOGTC CTITAOCTIG QOCAAACAGT TC CT OJ l'l T i A TAAGftTOOCT 

1801 AOCAITOCXX: QGGCAAGAC TCACaCAOCT OCTAITCATO AQOGAQOOCG 

1851 QCTOOIGAOC TTK33CTOCA ACAAGICT3G GGAOCTOOQC GITOOGAACT 

1901 ACAAGAAGOO TCTOOGAATC AAOCTGITOG Q333AOCXXT TOCTQOGAAG 

1951 CMGTCATCA GC33TCTOCT3 03GTD3ATCAG , TTEMCATIG CTO0CACn3\ 

2001 TCATAATCRC AITITIUJLT QOOQCAATOG TOGTTAATOGC OQOCT3QCAA 

2051 TCAOXXXaC A3AGPGAOCA OViaXTCIG ATATCTGTOC CICATOGCCT 

2101 OOQOCZDVTIT TIQGATCICT QCATCATCTC a33GaOCT3T CTTOOCCT30 

2151 ATO3CATAOC A3TCTCATOO TO3PGAAAGT ATIGAACTCT AAGAOCATOC 

2201 GTTOCAATfiG CTGTCUCT1A TOCATTOGAA CTX3IGITTCA G2GCTCT&GC 

2251 CCQ3GW33BG 0009003333 0Q0000IOJ1' GAAGAflGfiGG ACfiGKMCA 

2301 G3AATCTCAA ACT0CIG&3C CAAGOTGMG CTIOCGftGGA ACAATOGAAG 

2351 CAGAO0GAO3 AATOGAMOT TEAATC?OrC OCACAGM0C CTTO33GA2C 

2401 AGEAATO0QG OCAGC2GCTC CTCBJL'IUJl: T33CTTOGAA A0GttXUO 3 l 

2451 AAATOCAGAA TITATO00CA TOXTCACW3 QO CA TCItXT CICTCT3C2G- 

2501 CUriTlUAGA. ATCIGAGAAA . GATAOCXTOC OCTATCAMA GCTOCAAGGA. 



2551 OCTCIGAACC TCC m u ift A CSCAAPOCCC AAGIMAAGC 

2601 CTO3ICACCT COGCTCAATC CIGCBCTAAC CTCOTCTOOG AAOOGAACAC 

2651 CACIGACKr TOCTOOGTOT QC33K3CWXT CICIQCSC3CT GGfiOSITCfiG 

2701 AGATTCCAGG GTCTOGTOIT AAAGTOICTC GCTCftACAAC AGAAQCIBCA 

2751 QCAAGAAAAC CirxaGATIT TTAOOCAACr GCAGAASITC AACftAGAAAT 

2801 TSGAAQ3M33 QCPCCP£XSIG G33ATOCATT OCAAAGGAAC TCfiGACftQCA 

2851 AAQ3AAGAGA TOSAAATOSA TOCAAAQOCT GftCTERGAIT CAGATKXTIG 

2901 GK30CTOCIG OSAACAGACT CCICTAGAOC caQOCIC'iaG 






1 MSVLGEYERH CDSINSDFGS ESGGCGDSSP GPSASQGPRA GGGAAEQEEL 

51 HYIPIRVLGR GAFGEATLYR RTEDDSLWW KEVDLTRLSE KERRDALNEI 

101 VILALLQHDN IIAYYNHFMD NTTLLIELEY CNGGNLYDKI LRQKDKLFEE 

151 EMWWYLFQI VSAVSCIHKA GILHRDIKTL NIFLTKANLI KLGDYGLAKK 

201 LNSEYSMAET LVGTPYYMSP ELCQGVKYNF KSDIWAVGCV IFELLTLKRT 

251 FDATNPLNLC VKIVQGIRAM EVDSSQYSLE LIQMVHSCLD QDPEQRPTAD 

301 ELLDRPLLRK RRREMEEKVT LLNAPTKRPR SSTVTEAPIA WTSRTSEVY 

351 VWGGGKSTPQ KLDVIKSGCS ARQVCAGNTH FAWTVEKEL YTWVNMQGGT 

401 KLHGQLGHGD KASYRQPKHV EKLQGKAIHQ VSCGDDFTVC VTDEGQLYAF 

451 GSDYYGCMGV DKVAGPEVLE PMQLNFFLSN PVEQVSCGDN HWVLTRNKE 

501 VYSWGCGEYG RLGLDSEEDY YTPQKVDVPK ALIIVAVQCG CDGTFLLTQS 

551 GKVLACGLNE FNKLGLNQCM SGIINHEAYH EVPYTTSFTL AKQLSFYKIR 

601 TIAPGKTHTA AIDERGRLLT FGCNKCGQLG VGNYKKRLGI NLLGGPLGGK 

651 QVIRVSCGDE FTIAATDDNH IFAWGNGGNG RLAMTPTERP HGSDICTSWP 

701 RPIFGSLHHV PDLSCRGWHT ILIVEKVLNS KTIRSNSSGL SIGTVFQSSS 

751 PGGGGGGGGG EEEDSQQESE TPDPSGGFRG TMEADRGMEG LISPTEAMGN 

801 SNGASSSCPG WLRKELENAE FIPMPDSPSP LSAAFSESEK DTLPYEELQG 

851 LKVASEAPLE HKPQVEASSP RLNPAVTCAG KGTPLTPPAC ACSSLQVEVE 

901 RLQGLVLKCL AEQQKLQQEN LQIFTQLQKL NKKLEGGQQV GMHSKGTQTA 

951 KEEMEMDPKP DLDSDSWCLL GTDSCRPSL* 
Putative GNK Domains and Structural Features 

KINASE (44-315) 
GUANINE NUCLEOTIDE EXCHANGE FACTOR (GEF) (318-605) 
GLYCINE/ACIDIC-RICH TETHER (752-764) 
C-TERMINAL DOMAIN WITH NO KNOWN HOMOLOGY OR FUNCTION (765-979) 

1 MSVLGEYERH CDSINSDFGS ESGGCGDSSP GPSASQGPRA GGGAAEQEEL 

51 HYIPIRVLGR GAFGEATLYR RTEDDSLWW KEVDLTRLSE KERRDALNEI 

101 VILALLQHDN IIAYYNHFMD NTTLLIELEY CNGGNLYDKI LRQKDKLFEE 

151 EMWWYLFQI VSAVSCIHKA GILHRDIKTL NIFLTKANLI KLGDYGLAKK 

201 LNSEYSMAET LVGTPYYMSP ELCQGVKYNF KSDIWAVGCV IFELLTLKRT 

251 FDATNPLNLC VKIVQGIRAM EVDSSQYSLE LIQMVHSCLD QDPEQRPTAD 

301 ELLDRPLLRK RRREMEEKVT LLNAPTKRPR SSTVTEAPIA WTSRTSEVY 

351 VWGGGKSTPQ KLDVIKSGCS ARQVCAGNTH FAWTVEKEL YTWVNMQGGT 

401 KLHGQLGHGD KASYRQPKHV EKLQGKAIHQ VSCGDDFTVC VTDEGQLYAF 

451 GSDYYGCMGV DKVAGPEVLE PMQLNFFLSN PVEQVSCGDN HWVLTRNKE 

501 VYSWGCGEYG RLGLDSEEDY YTPQKVDVPK ALIIVAVQCG CDGTFLLTQS 

551 GKVLACGLNE FNKLGLNQCM SGIINHEAYH EVPYTTSFTL AKQLSFYKIR 

601 TIAPGKTHTA AIDERGRLLT FGCNKCGQLG VGNYKKRLGI NLLGGPLGGK 

651 QVIRVSCGDE FTIAATDDNH IFAWGNGGNG RLAMTPTERP HGSDICTSWP 

701 RPIFGSLHHV PDLSCRGWHT ILIVEKVLNS KTIRSNSSGL SIGTVFQSSS 

751 PGGGGGGGGG EEEDSQQESE TPDPSGGFRG TMEADRGMEG LISPTEAMGN 

801 SNGASSSCPG WLRKELENAE FIPMPDSPSP LSAAFSESEK DTLPYEELQG 

851 LKVASEAPLE HKPQVEASSP RLNPAVTCAG KGTPLTPPAC ACSSLQVEVE 

901 RLQGLVLKCL AEQQKLQQEN LQIFTQLQKL NKKLEGGQQV GMHSKGTQTA 

951 KEEMEMDPKP DLDSDSWCLL GTDSCRPSL 



Bicaudal D MAAEEVLQTVDHY 

SGNK MSAPSEEEEYARLVMEAQPEWL 

C-NAP1 (aa 121) NTHLEAQLQKAEEAGAELQADLRDIQEEKEEIQKKLSESRHQQEAATTQLEQLHQEAKRQ 

Bicaudal D KTEIERLTKELTETTHEKIQAAEYGLWLEEKLTIjKQQYDELEAEYDSIjKQELEQLKEAF 

SGNK RAEVKRLSHELAETTREKIQAAEYGLAVLEEKHQLKLQFEELEVDYEAIRSEMEQLKEAF 

C-NAP1 EEVIiARAVQEKEALVREKAALEVRLQAVERDRQDIiAEQLQGLSSAKELLESSIiFEAQQQN 

Bicaudal D GQ S F S I HRKVAEDGETREETL.LQE SASKEA YYLGK I LEMQNELKQSRAWTNVQAENERL 

SGNK GQAHTNHKKVAATCESREESLIQESASKEQYYVRKVLELQTELKQLR3WLTNTQSENERL 

C-NAP1 SVI EVTKGQLEVQ I QTVTQAKEV I QGEVRCLKLELDTERSQAE - QERDAAARQLAQAEQE 

Bicaudal D T AWQDLKENNEMVEIjQRI RMKDE I RE YKFREARIjLQDYTEIiEEENX TIjQKL VSTLKQNQ 

SGNK ASVAQELKEINQNVEIQRGRLRDDIKEYKPREARLLQDYSEIiEEENISLQKQVSVLRQNQ 

C-NAP1 GKTALEQQKAAHEKEVNQLREKWE-KERSWHQQELAKALESIfEREKMEIiEMRIiKEQ-QTE 

Bicaudal D VEYEGLKHEIKRFEEETVLLNSQLEDAIRLKEXAEHQIiEEALETLKNEREQKNNLRKELS 

SGNK VEFEGLKHEIKRLEEETEYLNSQLEDAIRLKEISERQLEEALETLKTEREQKNSLRKELS 

C-NAP1 MEAIQAQREEERTQAESALCQMQLETEKERVSLLETLIiQTQKELADASQQLERLRQDMKV 

Bicaudal D QYISLND NHISISVDGLKFAEDGSEPNN — DDKMNGHIHGPLVKLNGDYRTPTLRK 

SGNK HYMSINDSFYTSHLHVSLDGLKFSDDAAEPNNDAEALVNGFEHGGLAKLPLDNKTSTPKK 

C-NAP1 QKLKEQETTGILQTQLQEAQRELKEAARQHRDDLAALQEESSSLLQDKMDLQKQVEDLKS 

Bicaudal D GESLNPVSDLFSELNISEIQKLKQQLMQVEREKAILLANLQESQTQLEHTKGALTE 

SGNK EGLAPPSPSLVSDLLSELNISEIQKLKQQLMQMEREKAGLLATLQDTQKQLEHTRGSLSE 

C-NAP1 QLVAQDDSQRIiVEQEVQEKLRETQEYNRIQKELEREKASLTLSLMEKEQRLLVLQEADSI 

Bicaudal D QHERVHRLTEHVNAMRGLQ S SKELKAELDGEKGRDSGE EAHDYEVD I NGLE I LECKYRVA 

SGNK QQEKVTRLTENLSALRRLQASKERQTAIjDNEKDRDSHEIXjDYYEVDINGPEILACKYHVA 

C-NAP1 RQQELSALRQDMQEAQGEQKELSAQMELLRQEVKEK-EADFLAQEAQLLEELEASHITEQ 

Bicaudal D VTEVI DL KAE I KALKEK YNK S VENYTDEKAK YE S K I QMYDE Q VT S L EKTTKE SGEKMAHM 

SGNK VAEAGELREQLKALRSTHEAREAQHAEEKGRYEAEGQALTEKVSLLEKASRQDRELLARL 

C-NAP1 QLRASLWAQEAKAAQLQLRLRSTESQLEALAAEQQPGNQAQAQAQLASLYSALQQALGSV 

Bicaudal D EI^LQKMTSIANENHSTLNTAQDELVTFSEELAQLYHHVCLCNNETPNRVMLDYYRQSRV 

sGNK ElffiLKKVSDVAGETQGSLSVAQDELVTFSEELANLYHHVCMCNNETPNRVMLDYYREG — 

C-NAP1 CESRPELSGGGDSAPSVWGLEPDQNG — ARSLFKRGPLLTALSAEAVASALHKLHQDLWK 

Bicaudal D TRSGSLKGPDDPRGLLSPRLARRGVSSPVETRTSSEPVAKESTEPSKEPSPTKTPTISPV 

sGNK -QGG — AGRTSPGGRTSP — EARGRRSPILL PKGLLAPEAGRADGGTGDSSPSPG 

C-NAP1 TQQTRDVLRDQVQKLEERLTDTEAEKSQVHTELQDLQRQLSQNQEEKSKWEGKQNSLESE 

Bicaudal D ITAPPSSPVLDTSDIRKEPMNIYNLNAIIRDQIKHLQKAVDRSLQLSRQRAAARELAPMI 

SGNK SSLP — SPLSDP — RR-EPMNIYNLIAIIRDQIKHLQAAVDRTTELSRQRIASQELGPAV 

C-NAP1 LMELHETMASLQSRLRRAELQRMEAQGER E L LQ AAKENL TAQ VEHLQ AAWEARAQ 

Bicaudal D DKDKEALMEEILKLKSLLSTKREQXATLRAVLKANKQTAEVAIiANLKNKYENEKAHVTET 

SGNK DKDKEALMEEILKLKSLLSTKREQITTLRTVLKANK^ 

C-NAP1 ASAAGILEEDLRTARSALKLKNEEVESERERAQALQEQGELKVAQGKALQEN-LAIiLTQT 

Bicaudal D MTKLRNELKALKEDAATFSSLRTMFATRCDEYVTQLDEMQRQIiAAAEDEKKTLNTLLRMA 

SGNK MMKLRNELKALKEDAATFS SLRAMF ATRCDE Y I TQIiDEMQRQLAAAEDEKKTIiNSL LRMA 

C-NAP1 LAEREEEVETLRGQIQELEKQREMQKAALELLSLDLKKRNQEVDLQQEQIQELEKCRSVL 

Bicaudal D IQQKLALTQRLEDLEFDHEQSRRSKGKLG-KSKIGSPKV (-> 154 aa) 

SGNK IQQKLALTQRLELLELDHEQTRRGRAKAAPKTKPATPSL* 

C-NAP1 EHLPMAVQEREQKLTVQREQIRELEKDRETQRNVLEHQL (-> 914 aa) 



Comparison of sGNK with coiled-coil domains of human Bicaudal D 
and the human centrosomal NEK-1 substrate protein C-Nap1 



FIGURE 7 



sGNK is a substrate for GNK in vitro. 




FIGURE 8 



Final GNK purification step: 
microbore Mono Q column chromatography 




silver stain; autokinase activity 



FIGURE 9 
FIGURE 11A (top) 
FIGURE 11B (bottom) 



